
    Task-demands can immediately reverse the effects of sensory-driven saliency in complex visual stimuli

    In natural vision both stimulus features and task-demands affect an observer's attention. However, the relationship between sensory-driven (“bottom-up”) and task-dependent (“top-down”) factors remains controversial: Can task-demands counteract strong sensory signals fully, quickly, and irrespective of bottom-up features? To measure attention under naturalistic conditions, we recorded eye-movements in human observers while they viewed photographs of outdoor scenes. In the first experiment, smooth modulations of contrast biased the stimuli's sensory-driven saliency towards one side. In free-viewing, observers' eye-positions were immediately biased toward the high-contrast, i.e., high-saliency, side. However, this sensory-driven bias disappeared entirely when observers searched for a bull's-eye target embedded with equal probability on either side of the stimulus. When the target always occurred in the low-contrast side, observers' eye-positions were immediately biased towards this low-saliency side, i.e., the sensory-driven bias reversed. Hence, task-demands not only override sensory-driven saliency but actively countermand it. In a second experiment, a 5-Hz flicker replaced the contrast gradient. Whereas the bias was less persistent in free viewing, the overriding and reversal took longer to deploy. Hence, insufficient sensory-driven saliency cannot account for the bias reversal. In a third experiment, subjects searched for a spot of locally increased contrast (“oddity”) instead of the bull's-eye (“template”). In contrast to the other conditions, a slight sensory-driven free-viewing bias prevailed in this condition. In a fourth experiment, we demonstrate that, at known locations, template targets are detected faster than oddity targets, suggesting that the former induce a stronger top-down drive when used as search targets. Taken together, task-demands can override sensory-driven saliency in complex visual stimuli almost immediately, and the extent of overriding depends on the search target and the overridden feature, but not on the latter's free-viewing saliency.
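    A minimal sketch of the contrast manipulation described above, assuming a grayscale image with values in [0, 1]; the ramp endpoints are illustrative choices, not the paper's parameters:

    ```python
    import numpy as np

    def contrast_gradient(image, low=0.2, high=1.0):
        """Scale deviations from mean luminance by a left-to-right ramp,
        smoothly biasing local contrast (and thus saliency) to one side."""
        ramp = np.linspace(low, high, image.shape[1])  # one factor per column
        mean = image.mean()
        return np.clip(mean + (image - mean) * ramp[np.newaxis, :], 0.0, 1.0)

    # Example: a random "scene" whose contrast increases toward the right.
    rng = np.random.default_rng(0)
    biased_scene = contrast_gradient(rng.random((480, 640)))
    ```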

    Objects predict fixations better than early saliency

    Humans move their eyes while looking at scenes and pictures. Eye movements correlate with shifts in attention and are thought to be a consequence of optimal resource allocation for high-level tasks such as visual recognition. Models of attention, such as “saliency maps,” are often built on the assumption that “early” features (color, contrast, orientation, motion, and so forth) drive attention directly. We explore an alternative hypothesis: Observers attend to “interesting” objects. To test this hypothesis, we measure the eye position of human observers while they inspect photographs of common natural scenes. Our observers perform different tasks: artistic evaluation, analysis of content, and search. Immediately after each presentation, our observers are asked to name objects they saw. Weighted with recall frequency, these objects predict fixations in individual images better than early saliency, irrespective of task. Also, saliency combined with object positions predicts which objects are frequently named. This suggests that early saliency has only an indirect effect on attention, acting through recognized objects. Consequently, rather than treating attention as a mere preprocessing step for object recognition, models of both need to be integrated.
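    A sketch of the comparison's logic under an assumed data layout (binary object masks and name-recall counts per image); this is not the authors' code, just the weighting-and-scoring idea:

    ```python
    import numpy as np

    def object_map(shape, object_masks, recall_counts):
        """Sum binary object masks, each weighted by how often the object
        was named, yielding an object-based fixation-prediction map."""
        m = np.zeros(shape)
        for mask, count in zip(object_masks, recall_counts):
            m += count * mask
        return m / m.max() if m.max() > 0 else m

    def auc_like_score(pred_map, fixations, rng=np.random.default_rng(1)):
        """Fraction of (fixated, random) location pairs where the map is
        higher at the fixation: a simple measure of predictive power."""
        ys, xs = zip(*fixations)  # fixations: list of (row, col) positions
        fix_vals = pred_map[list(ys), list(xs)]
        rand_vals = pred_map[rng.integers(0, pred_map.shape[0], len(fixations)),
                             rng.integers(0, pred_map.shape[1], len(fixations))]
        return float(np.mean(fix_vals[:, None] > rand_vals[None, :]))
    ```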

    Predicting human gaze using low-level saliency combined with face detection

    Under natural viewing conditions, human observers shift their gaze to allocate processing resources to subsets of the visual input. Many computational models try to predict such voluntary eye and attentional shifts. Although the important role of high level stimulus properties (e.g., semantic information) in search stands undisputed, most models are based on low-level image properties. We here demonstrate that a combined model of face detection and low-level saliency significantly outperforms a low-level model in predicting locations humans fixate on, based on eye-movement recordings of humans observing photographs of natural scenes, most of which contained at least one person. Observers, even when not instructed to look for anything particular, fixate on a face with a probability of over 80% within their first two fixations; furthermore, they exhibit more similar scanpaths when faces are present. Remarkably, our model’s predictive performance in images that do not contain faces is not impaired, and is even improved in some cases by spurious face detector responses.
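    The combination can be sketched as adding a face channel to a low-level saliency map. Here OpenCV's stock Haar cascade stands in for the face detector, and the additive weighting is an illustrative choice rather than the paper's fitted model:

    ```python
    import cv2
    import numpy as np

    def combined_map(gray_uint8, saliency, face_weight=1.0):
        """Normalize a low-level saliency map and add blurred face-detector
        responses as an extra conspicuity channel."""
        cascade = cv2.CascadeClassifier(
            cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
        face_map = np.zeros_like(saliency, dtype=np.float32)
        for (x, y, w, h) in cascade.detectMultiScale(gray_uint8):
            face_map[y:y + h, x:x + w] = 1.0                # mark face boxes
        face_map = cv2.GaussianBlur(face_map, (0, 0),
                                    sigmaX=gray_uint8.shape[1] / 40)
        combined = saliency / (saliency.max() + 1e-9) + face_weight * face_map
        return combined / (combined.max() + 1e-9)
    ```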

    Using binocular rivalry to tag foreground sounds: Towards an objective visual measure for auditory multistability

    In binocular rivalry, paradigms have been proposed for unobtrusive moment-by-moment readout of observers' perceptual experience (“no-report paradigms”). Here, we take a first step to extend this concept to auditory multistability. Observers continuously reported which of two concurrent tone sequences they perceived in the foreground: high-pitch (1008 Hz) or low-pitch (400 Hz) tones. Interstimulus intervals were either fixed per sequence (Experiments 1 and 2) or random with tones alternating (Experiment 3). A horizontally drifting grating was presented to each eye; to induce binocular rivalry, gratings had distinct colors and motion directions. To associate each grating with one tone sequence, a pattern on the grating jumped vertically whenever the respective tone occurred. We found that the direction of the optokinetic nystagmus (OKN)—induced by the visually dominant grating—could be used to decode the tone (high/low) that was perceived in the foreground well above chance. This OKN-based readout improved after observers had gained experience with the auditory task (Experiments 1 and 2) and for simpler auditory tasks (Experiment 3). We found no evidence that the visual stimulus affected auditory multistability. Although decoding performance is still far from perfect, our paradigm may eventually provide a continuous estimate of the currently dominant percept in auditory multistability.
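    A sketch of the OKN-based readout, assuming a horizontal eye-position trace in degrees and that each grating drifts in a known direction; the fixed windowing is an illustrative simplification:

    ```python
    import numpy as np

    def decode_foreground(eye_x, fs, win_s=1.0):
        """Return +1/-1 per window: the sign of the median horizontal eye
        velocity. The median suppresses fast (saccadic) phases, so the sign
        follows the slow phase, i.e. the dominant grating's drift direction,
        and thereby the tone sequence tagged to that grating."""
        vel = np.gradient(eye_x) * fs                # velocity in deg/s
        win = int(win_s * fs)
        n_windows = len(vel) // win
        return np.sign([np.median(vel[i * win:(i + 1) * win])
                        for i in range(n_windows)])
    ```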

    Spatial attention increases performance but not subjective confidence in a discrimination task

    Selective attention to a target yields faster and more accurate responses. Faster response times, in turn, are usually associated with increased subjective confidence. Could the decrease in reaction time in the presence of attention therefore simply reflect a shift toward more confident responses? We here addressed the extent to which attention modulates accuracy, processing speed, and confidence independently. To probe the effect of spatial attention on performance, we used two attentional manipulations of a visual orientation discrimination task. We demonstrate that spatial attention significantly increases accuracy, whereas subjective confidence measures reveal overconfidence in non-attended stimuli. At constant confidence levels, reaction times showed a significant decrease (by 15–49%, corresponding to 100–250 ms). This dissociation of objective performance and subjective confidence suggests that attention and awareness, as measured by confidence, are distinct, albeit related, phenomena.
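    The matched-confidence comparison can be sketched with a hypothetical trial table (column names are assumptions, not the study's variables):

    ```python
    import pandas as pd

    def rt_by_confidence(trials: pd.DataFrame) -> pd.DataFrame:
        """trials: columns 'attended' (bool), 'confidence' (ordinal rating),
        and 'rt' (seconds). Median RT per confidence level and attention
        condition, so that any speed difference between conditions cannot
        stem from a shift toward more confident responses."""
        return (trials.groupby(["confidence", "attended"])["rt"]
                      .median()
                      .unstack("attended"))
    ```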

    A bottom–up model of spatial attention predicts human error patterns in rapid scene recognition

    Humans demonstrate a peculiar ability to detect complex targets in rapidly presented natural scenes. Recent studies suggest that (nearly) no focal attention is required for overall performance in such tasks. Little is known, however, of how detection performance varies from trial to trial and which stages in the processing hierarchy limit performance: bottom–up visual processing (attentional selection and/or recognition) or top–down factors (e.g., decision-making, memory, or alertness fluctuations)? To investigate the relative contribution of these factors, eight human observers performed an animal detection task in natural scenes presented at 20 Hz. Trial-by-trial performance was highly consistent across observers, far exceeding the prediction of independent errors. This consistency demonstrates that performance is not primarily limited by idiosyncratic factors but by visual processing. Two statistical stimulus properties, contrast variation in the target image and the information-theoretical measure of “surprise” in adjacent images, predict performance on a trial-by-trial basis. These measures are tightly related to spatial attention, demonstrating that spatial attention and rapid target detection share common mechanisms. To isolate the causal contribution of the surprise measure, eight additional observers performed the animal detection task in sequences that were reordered versions of those that all subjects had correctly recognized in the first experiment. Reordering increased surprise before and/or after the target while keeping the target and distractors themselves unchanged. Surprise enhancement impaired target detection in all observers. Consequently, and contrary to several previously published findings, our results demonstrate that attentional limitations, rather than target recognition alone, affect the detection of targets in rapidly presented visual sequences.
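    The “surprise” measure can be illustrated with a strongly simplified Bayesian sketch: maintain a Gaussian belief per image location and score each new frame by the KL divergence from prior to posterior. The Gaussian model and the update rate are illustrative, not the paper's exact formulation:

    ```python
    import numpy as np

    def gaussian_kl(mu_p, var_p, mu_q, var_q):
        """KL(q || p) between univariate Gaussians, elementwise."""
        return 0.5 * (np.log(var_p / var_q)
                      + (var_q + (mu_q - mu_p) ** 2) / var_p - 1.0)

    def frame_surprise(prior_mu, prior_var, frame, alpha=0.5):
        """Update the belief with the new frame; summed KL = surprise."""
        post_mu = (1 - alpha) * prior_mu + alpha * frame
        post_var = np.maximum(
            (1 - alpha) * prior_var + alpha * (frame - post_mu) ** 2, 1e-6)
        s = gaussian_kl(prior_mu, prior_var, post_mu, post_var).sum()
        return s, post_mu, post_var
    ```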

    Salience-based object prioritization during active viewing of naturalistic scenes in young and older adults

    Whether fixation selection in real-world scenes is guided by image salience or by objects has been a matter of scientific debate. To contrast the two views, we compared effects of location-based and object-based visual salience in young and older (65+ years) adults. Generalized linear mixed models were used to assess the unique contribution of salience to fixation selection in scenes. When analysing fixation guidance without recourse to objects, visual salience predicted whether image patches were fixated or not. This effect was reduced for the elderly, replicating an earlier finding. When using objects as the unit of analysis, we found that highly salient objects were more frequently selected for fixation than objects with low visual salience. Interestingly, this effect was larger for older adults. We also analysed where viewers fixate within objects, once they are selected. A preferred viewing location close to the centre of the object was found for both age groups. The results support the view that objects are important units of saccadic selection. Reconciling the salience view with the object view, we suggest that visual salience contributes to prioritization among objects. Moreover, the data point towards an increasing relevance of object-bound information with increasing age.
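    The generalized linear mixed model analysis can be sketched with statsmodels' Bayesian mixed GLM (the paper does not specify this library; the column names and random-effects structure are assumptions):

    ```python
    import pandas as pd
    from statsmodels.genmod.bayes_mixed_glm import BinomialBayesMixedGLM

    # df: one row per object (or image patch) with columns 'fixated' (0/1),
    # 'salience' (standardized), 'age_group', and a 'subject' identifier.
    def fit_salience_glmm(df: pd.DataFrame):
        model = BinomialBayesMixedGLM.from_formula(
            "fixated ~ salience * age_group",   # fixed effects
            {"subject": "0 + C(subject)"},      # random intercepts per subject
            df)
        return model.fit_vb()                   # variational Bayes fit
    ```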

    The role of first- and second-order stimulus features for human overt attention

    When processing complex visual input, human observers sequentially allocate their attention to different subsets of the stimulus. What are the mechanisms and strategies that guide this selection process? We investigated the influence of various stimulus features on human overt attention, that is, attention related to shifts of gaze, with natural color images and modified versions thereof. Our experimental modifications, systematic changes of hue across the entire image, influenced only the global appearance of the stimuli, leaving the local features under investigation unaffected. We demonstrated that these modifications consistently reduced the subjective interpretation of a stimulus as “natural” across observers. By analyzing fixations, we found that first-order features, such as luminance contrast, saturation, and color contrast along either of the cardinal axes, correlated to overt attention in the modified images. In contrast, no such correlation was found in unmodified outdoor images. Second-order luminance contrast (“texture contrast”) correlated to overt attention in all conditions. However, although none of the second-order color contrasts were correlated to overt attention in unmodified images, one of the second-order color contrasts did exhibit a significant correlation in the modified images. These findings imply, on the one hand, that higher-order bottom-up effects—namely, those of second-order luminance contrast—may partially account for human overt attention. On the other hand, these results also demonstrate that global image properties, which correlate to the subjective impression of a scene being “natural,” affect the guidance of human overt attention.
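    First-order luminance contrast and the second-order “texture contrast” mentioned above can be sketched as nested local standard deviations; the patch size is an illustrative choice:

    ```python
    import numpy as np
    from scipy.ndimage import uniform_filter

    def local_std(x, size):
        """Standard deviation of x within a size x size neighborhood."""
        mean = uniform_filter(x, size)
        mean_sq = uniform_filter(x * x, size)
        return np.sqrt(np.maximum(mean_sq - mean * mean, 0.0))

    def contrast_features(luminance, size=33):
        c1 = local_std(luminance, size)  # first-order: luminance contrast
        c2 = local_std(c1, size)         # second-order: contrast of contrast
        return c1, c2
    ```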

    The relation of phase noise and luminance contrast to overt attention in complex visual stimuli

    Models of attention are typically based on difference maps in low-level features but neglect higher-order stimulus structure. To what extent do higher-order statistics affect human attention in natural stimuli? We recorded eye movements while observers viewed unmodified and modified images of natural scenes. Modifications included contrast modulations (resulting in changes to first- and second-order statistics), as well as the addition of noise to the Fourier phase (resulting in changes to higher-order statistics). We have the following findings: (1) Subjects' interpretation of a stimulus as a “natural” depiction of an outdoor scene depends on higher-order statistics in a highly nonlinear, categorical fashion. (2) Confirming previous findings, contrast is elevated at fixated locations for a variety of stimulus categories. In addition, we find that the size of this elevation depends on higher-order statistics and decreases with increasing phase noise. (3) Global modulations of contrast bias eye position toward high contrasts, consistent with a linear effect of contrast on fixation probability. This bias is independent of phase noise. (4) Small patches of locally decreased contrast repel eye position less than large patches of the same aggregate area, irrespective of phase noise. Our findings provide evidence that deviations from surrounding statistics, rather than contrast per se, underlie the well-established relation of contrast to fixation.
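    The phase-noise manipulation can be sketched as follows: keep the amplitude spectrum, add a scaled random phase field, and invert. Using the phase of a white-noise image keeps the noise conjugate-symmetric, so the inverse transform stays (numerically) real; the noise level is a free parameter:

    ```python
    import numpy as np

    def add_phase_noise(image, level, rng=None):
        """level in [0, 1]: 0 leaves the image intact, 1 fully randomizes
        the Fourier phase while preserving the amplitude spectrum."""
        rng = np.random.default_rng() if rng is None else rng
        f = np.fft.fft2(image)
        noise_phase = np.angle(np.fft.fft2(rng.standard_normal(image.shape)))
        noisy = np.abs(f) * np.exp(1j * (np.angle(f) + level * noise_phase))
        return np.real(np.fft.ifft2(noisy))
    ```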

    Pupil size signals novelty and predicts later retrieval success for declarative memories of natural scenes

    Declarative memories of personal experiences are a key factor in defining oneself as an individual, which becomes particularly evident when this capability is impaired. Assessing the physiological mechanisms of human declarative memory is typically restricted to patients with specific lesions and requires invasive brain access or functional imaging. We investigated whether the pupil, an accessible physiological measure, can be utilized to probe memories for complex natural visual scenes. During memory encoding, scenes that were later remembered elicited a stronger pupil constriction compared to scenes that were later forgotten. Thus, pupil size predicts success or failure of memory formation. In contrast, novel scenes elicited stronger pupil constriction than familiar scenes during retrieval. When viewing previously memorized scenes, those that were forgotten (misjudged as novel) still elicited stronger pupil constrictions than those correctly judged as familiar. Furthermore, pupil constriction was influenced more strongly if images were judged with high confidence. Thus, we propose that pupil constriction can serve as a marker of novelty. Since stimulus novelty modulates the efficacy of memory formation, our pupil measurements during learning indicate that the later forgotten images were perceived as less novel than the later remembered pictures. Taken together, our data provide evidence that pupil constriction is a physiological correlate of a neural novelty signal during formation and retrieval of declarative memories for complex, natural scenes.
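    A sketch of the constriction measure under an assumed data layout (pupil-diameter traces aligned to stimulus onset; the baseline length is an illustrative choice):

    ```python
    import numpy as np

    def constriction(trace, fs, baseline_s=0.5):
        """Peak stimulus-evoked constriction: pre-stimulus baseline mean
        minus the post-baseline minimum (positive = constriction)."""
        n_base = int(baseline_s * fs)
        return trace[:n_base].mean() - trace[n_base:].min()

    def compare_memory(traces, remembered, fs):
        """Mean constriction for later-remembered vs. later-forgotten scenes."""
        c = np.array([constriction(t, fs) for t in traces])
        remembered = np.asarray(remembered, dtype=bool)
        return c[remembered].mean(), c[~remembered].mean()
    ```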